This notebook accompanies the article entitled:
Causal inference with multiple versions of treatment and application to personalized medicine
It contains the investigation on PDX data referenced as section 5 of the article.
We want to evaluate precision medicine (PM) strategies quantitatively with observational data. To do so, we emulate target trials in the potential outcomes framework and estimate the causal effects of interest.
Target trials to estimate the causal effect of a precision medicine (PM) algorithm versus different controls. Patients are first screened for eligibility: based on their genomic characteristics, patients are either recommended a specific treatment (eligible) or not (not eligible). Eligible patients are then randomized and assigned either to the PM-directed arm or to one of the alternative control arms (CE_1, CE_2 or CE_3).
We define 3 different target trials comparing a precision medicine arm with 3 different control arms therefore defining 3 different causal effects CE1/CE2/CE3.
Our first causal effect of interest quantifies the effect of PM versus a simple control, called CE1 in this document. In practice, this causal effect corresponds to the expected gain compared with a single-version reference or standard of care. This reference treatment can also be understood as the absence of treatment in some cases.
Our second causal effect of interest quantifies the PM improvement compared with the treatments actually given in the real cohort, i.e. the physician’s choice, called CE2 in this document. We assume that treatment assignments in observational data have their own rationale and we compare our PM algorithm to this rationale. In practice, this causal effect compares the treatment assignment observed in the data with the reassignment of treatments that the PM algorithm would have performed.
Our third causal effect of interest quantifies the relevance of the treatment assignment itself, i.e. the potential improvement of the PM treatment assignment compared with a random assignment of the same versions of treatment, called CE3 in this document.
All these effects are defined only for PM-eligible patients, i.e. for patients whose mutations result in a personalized treatment recommendation. For non-eligible patients it does not make sense to quantify the impact of the PM algorithm or to compare it with a control.
We will use the potential outcomes framework to estimate the causal effects described in the previous section with observational data. Below, we summarize very briefly the variables we use to model our precision medicine setting. Please refer to the article for a detailed description of the potential outcomes framework, counterfactual variables and the impact of the multiplicity of versions of treatment.
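As a reading aid, the three contrasts can be written schematically as follows. The notation is illustrative and is not taken from the article: \(Y(k)\) is the potential outcome under treatment version \(k\), \(K^{PM}\) the version recommended by the PM algorithm, \(K^{obs}\) the version actually received, \(k_0\) the single reference treatment, and \(K^{rand}\) a version drawn at random among the versions used by the PM algorithm (the sampling distribution of \(K^{rand}\) is left unspecified here).

\[CE_1 = \mathbb{E}\left[\,Y(K^{PM}) - Y(k_0) \mid \text{eligible}\,\right]\]
\[CE_2 = \mathbb{E}\left[\,Y(K^{PM}) - Y(K^{obs}) \mid \text{eligible}\,\right]\]
\[CE_3 = \mathbb{E}\left[\,Y(K^{PM}) - Y(K^{rand}) \mid \text{eligible}\,\right]\]

Please refer to the article for the exact definitions and identification assumptions.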
Causal diagram in Precision Medicine
We have:
We will base our analysis on a dataset of Patient-Derived Xenografts (PDX) from Gao et al. These are patients’ tumours implanted in mice. Since several pieces of the same tumour can be implanted in several mice, different drugs can be tested on the same tumour (same patient of origin). In a way, we have access to some counterfactual values. This may help us evaluate our ability to recover real causal effects with dedicated statistical methods.
Each patient has been screened for different drugs (not the same subset of drugs for all patients). The response was determined by comparing tumor volume change at time \(t\) to tumor volume at time \(t_0\). Several metrics are computed:
\(TumourVolumeChange(\%) = \Delta Vol_t = 100\% \times \dfrac{V_t-V_{t_0}}{V_{t_0}}\)
\(BestResponse = min(\Delta Vol_t), t>10d\)
\(AvgResponse_t = mean(\Delta Vol_i, 0 \leq i\leq t)\)
\(BestAvgResponse = min(AvgResponse_t), t>10d\)
We will mainly focus on \(BestAvgResponse\). This metric “captures a combination of speed, strength and durability of response into a single value”. Qualitatively, lower values correspond to more effective drugs.
We also use the \(ResponseCategory\) provided with the data, which is based on mRECIST criteria. We re-process the data and define a binary variable \(ResponseBin\), which is 0 when the tumour-drug combination experienced progressive disease and 1 otherwise (complete response, partial response or stable disease).
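The metric definitions above can be illustrated with a small Python sketch (the notebook itself is written in R, so this is not the code used for the analysis). It assumes that the volume change is expressed relative to the baseline volume \(V_{t_0}\), that \(AvgResponse_t\) averages over the measured time points up to \(t\), and that the function name and toy measurements are purely hypothetical.

```python
def response_metrics(days, volumes, min_day=10):
    """Response metrics for one tumour-drug time series.

    days    : measurement times in days, increasing, with days[0] = t0
    volumes : tumour volumes measured at those times
    """
    baseline = volumes[0]
    # Percentage volume change relative to the baseline volume (assumption)
    delta = [100.0 * (v - baseline) / baseline for v in volumes]

    # Running mean of the volume change from t0 up to each measured time t
    avg = []
    running = 0.0
    for n, d in enumerate(delta, start=1):
        running += d
        avg.append(running / n)

    # Only time points strictly after min_day are candidates for the minimum
    late = [i for i, t in enumerate(days) if t > min_day]
    return {
        "BestResponse": min(delta[i] for i in late),
        "BestAvgResponse": min(avg[i] for i in late),
    }


# Hypothetical measurements, for illustration only
metrics = response_metrics([0, 5, 10, 15, 20], [200, 180, 150, 120, 160])
```

In this toy series the tumour shrinks and then regrows: \(BestResponse\) picks up the deepest point, while \(BestAvgResponse\) also reflects how long the shrinkage was sustained.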
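The binarization can be sketched as follows in Python (again, not the notebook's R code). The category labels `CR`, `PR`, `SD` and `PD` are assumed spellings and should be checked against the actual data; missing categories are kept as missing rather than being coded 0 or 1.

```python
# Assumed label for progressive disease; verify against the data
PROGRESSIVE = "PD"


def response_bin(category):
    """Return 0 for progressive disease, 1 otherwise, None if missing."""
    if category is None:
        return None
    return 0 if category.strip().upper() == PROGRESSIVE else 1


categories = ["CR", "PR", "SD", "PD", None]
bins = [response_bin(c) for c in categories]
```

Keeping missing values explicit makes it easier to filter them out before fitting any model on the binary outcome.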
Last, but not least, we have omics profiles for many of these tumours with information on mutations/CNA and RNA. Only mutations/CNA are used in the present example.
Data are imported from the Supplementary Material of the paper. The tissue of origin of each tumour is recovered from the Xeva R package.
## [1] "Done"
Before studying precision medicine trials with PDX data, we provide below a very generic description of the whole PDX dataset for readers who would like to familiarize themselves with it and get a more global view.
The dataset comprises 281 PDX models, on which 63 drugs have been tested. Nevertheless, the data matrix is quite sparse since not all drugs have been tested for all patients.
Some drugs have been tested in a comprehensive subset of PDX models (LEE011, binimetinib…) but most of them have been tested according to cancer-specific patterns.
Let’s first observe the differences between untreated and treated patients (whatever treatment they receive).
Treated tumours have, on average, a decreased volume. Besides, the response metrics with and without treatment are significantly correlated, supporting the hypothesis of a latent tumour aggressiveness factor.
Assuming \(BestAvgResponse\) when untreated is a good proxy for Aggressiveness, is it evenly distributed among cancer tissues?
Analysis per tissue, based on binary Responder status:
All in all, CM tumours appear a little bit more aggressive.
Now, what about differences between drugs?
Drug combinations are highly represented among the most effective drugs. All in all, there is a good “overlap” between the distributions of drug effects: we do not observe completely distinct distributions, with some drugs being systematically more effective than others for all PDX models.
Can we go even further and say that all drugs have at least a few patients for whom they prove to be the best?
Not exactly. Some drugs (and especially combinations) win the gold medal more often, but the landscape is still quite diverse and there is no panacea! This supports the idea that the best strategy is not to treat all patients with the same drugs but rather to adapt drugs to patients, which argues for a precision medicine approach.
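The "gold medal" count can be sketched as below in Python (not the notebook's R code). The data layout, model identifiers and values are hypothetical. Because the data matrix is sparse, each model is only compared across the drugs actually tested on it, and lower \(BestAvgResponse\) is taken as better.

```python
def best_drug_counts(responses):
    """Count how often each drug gives the lowest response for a model.

    responses: {model_id: {drug: BestAvgResponse}}; untested drugs are absent.
    """
    counts = {}
    for by_drug in responses.values():
        if not by_drug:
            continue  # model without any tested drug
        best = min(by_drug, key=by_drug.get)  # lower value = better response
        counts[best] = counts.get(best, 0) + 1
    return counts


# Hypothetical values, for illustration only
toy_responses = {
    "X-001": {"LEE011": 20.0, "binimetinib": -15.0},
    "X-002": {"LEE011": -5.0, "BYL719": 10.0, "binimetinib": 0.0},
    "X-003": {"BYL719": -30.0},
}
counts = best_drug_counts(toy_responses)
```

A drug tested on few models can only win on those models, so raw counts should be read alongside the number of models each drug was tested on.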
Let’s now define our precision medicine treatment strategy, PM1:
PTEN is also included in the genomic covariates of the model since it has been identified as a relevant predictor. The drug LEE011 is considered as our standard non-targeted drug.
Now we will study the clinical impact of our PM algorithm on PDX models. To do this, we focus the analysis on models that are eligible for this PM algorithm (i.e. mutated for BRAF/KRAS or PIK3CA) and have full data availability (i.e. responses for the drugs binimetinib, BYL719 and LEE011).
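The cohort selection can be sketched as follows in Python (not the notebook's R code). The mapping from biomarkers to drugs is inferred from the rest of this section (BRAF/KRAS with binimetinib, PIK3CA with BYL719); the precedence given to BRAF/KRAS when both pathways are mutated is an assumption, and the data layout and model identifiers are hypothetical.

```python
REQUIRED_DRUGS = ("binimetinib", "BYL719", "LEE011")


def pm_recommendation(mutations):
    """Return the recommended drug, or None if the model is not eligible."""
    if mutations & {"BRAF", "KRAS"}:
        return "binimetinib"  # assumed precedence over PIK3CA
    if "PIK3CA" in mutations:
        return "BYL719"
    return None


def analysis_cohort(models):
    """Keep eligible models with responses for all required drugs.

    models: {model_id: {"mutations": set, "responses": {drug: value}}}
    Returns {model_id: recommended_drug}.
    """
    cohort = {}
    for model_id, model in models.items():
        drug = pm_recommendation(model["mutations"])
        complete = all(d in model["responses"] for d in REQUIRED_DRUGS)
        if drug is not None and complete:
            cohort[model_id] = drug
    return cohort


# Hypothetical models, for illustration only
toy_models = {
    "X-001": {"mutations": {"BRAF"},
              "responses": {"binimetinib": -20.0, "BYL719": 5.0, "LEE011": 10.0}},
    "X-002": {"mutations": {"PIK3CA", "PTEN"},
              "responses": {"binimetinib": 8.0, "BYL719": -12.0, "LEE011": 3.0}},
    "X-003": {"mutations": {"KRAS"},
              "responses": {"binimetinib": -5.0}},  # incomplete data
    "X-004": {"mutations": {"TP53"},
              "responses": {"binimetinib": 1.0, "BYL719": 2.0, "LEE011": 3.0}},
}
cohort = analysis_cohort(toy_models)
```

Here `X-003` is eligible but dropped for missing responses, and `X-004` is dropped because it carries none of the biomarkers.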
This reduced cohort contains 88 patients. We plot below the distribution of tissues and biomarkers.
Let’s first check whether the algorithm has been designed in a meaningful way, consistent with the data.
Treatment assignment algorithm and observed drug sensitivities are consistent since mutated BRAF/KRAS tumours have a better binimetinib response and mutated PIK3CA tumours have a better BYL719 response. In addition, it can be noted that these biomarkers have deleterious cross-effects.
We can also look at some treatment strategy variables in order to get the aggregated picture.
We sample 1000 cohorts of 70 patients (out of 88) and randomly assign an observed treatment to each patient. Then we compute the different estimates for all cohorts.
## [1] "Computation of estimates done"
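The resampling scheme can be sketched as below in Python (not the notebook's R code). Since every model in the reduced cohort has responses for all three drugs, the counterfactual "true" value of each contrast can be computed directly in each sampled cohort; the estimators applied to the observed data are described in the article and are not reproduced here. Several choices are assumptions made for illustration: cohorts are drawn without replacement, observed treatments are drawn uniformly among the three drugs, and the random-assignment control for CE3 is a uniform draw among the two targeted drugs. With \(BestAvgResponse\) as outcome, negative values favour PM.

```python
import random

TARGETED = ("binimetinib", "BYL719")
DRUGS = TARGETED + ("LEE011",)


def true_effects(cohort, observed):
    """Counterfactual CE1, CE2, CE3 averaged over one cohort."""
    ce1 = ce2 = ce3 = 0.0
    for model in cohort:
        y = model["responses"]
        y_pm = y[model["pm_drug"]]
        ce1 += y_pm - y["LEE011"]               # PM vs single reference drug
        ce2 += y_pm - y[observed[model["id"]]]  # PM vs treatment "observed"
        # PM vs uniform random choice among the targeted drugs (assumption)
        ce3 += y_pm - sum(y[d] for d in TARGETED) / len(TARGETED)
    n = len(cohort)
    return ce1 / n, ce2 / n, ce3 / n


def simulate(models, n_cohorts=1000, cohort_size=70, seed=0):
    """Draw cohorts, assign random observed treatments, compute contrasts."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_cohorts):
        cohort = rng.sample(models, cohort_size)  # without replacement
        observed = {m["id"]: rng.choice(DRUGS) for m in cohort}
        results.append(true_effects(cohort, observed))
    return results


# Hypothetical cohort of 88 models with identical responses
toy_cohort = [
    {"id": i, "pm_drug": TARGETED[i % 2],
     "responses": {"binimetinib": -10.0, "BYL719": -10.0, "LEE011": 0.0}}
    for i in range(88)
]
results = simulate(toy_cohort, n_cohorts=5)
```

In this toy cohort both targeted drugs are equally good, so CE1 is constant at -10, CE3 is 0, and only CE2 varies with the randomly drawn observed treatments.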
We sample 1000 cohorts of 70 patients (out of 88) and randomly assign an observed treatment to each patient. Then we compute the different estimates for all cohorts.
We can reproduce very similar analyses replacing the continuous outcome by a binary one.
This reduced cohort still contains 88 patients, with the same distribution of tissues and mutations.
What is the global landscape of drug sensitivities in this reduced cohort?
Similarly, we sample 1000 cohorts of 70 patients (out of 88) and randomly assign an observed treatment to each patient. Then we compute the different estimates for all cohorts.
## Error in glm.fit(x = structure(c(1.39909676787471, 0, 1.39909676787471, :
## NA/NaN/Inf in 'y'
## [1] "Computation of estimates done"
## [1] "Computation done in:"
## 8396.022 sec elapsed